Class phrase models for language modelling
Abstract
Previous attempts to automatically determine multi-words as the basic unit for language modeling have been successful for extending bigram models [10, 9, 2, 8] to improve the perplexity of the language model and/or the word accuracy of the speech decoder. However, none of these techniques has so far given improvements over the trigram model, except for the rather controlled ATIS task [8]. We therefore propose an algorithm that directly maximizes the perplexity improvement of a bigram model. The new algorithm is able to reduce the trigram perplexity and also achieves word accuracy improvements in the Verbmobil task. It is the natural counterpart of successful word classification algorithms for language modeling [4, 7] that minimize the leaving-one-out bigram perplexity. We also give some details on the usage of class finding techniques and m-gram models, which can be crucial to successful applications of this technique.
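As a rough illustration of what selecting a multi-word unit by bigram perplexity can look like, the following Python sketch scores every adjacent word pair by how much the bigram perplexity drops when that pair is merged into one phrase token. It is not the authors' algorithm: add-one smoothing on the training text is swapped in for the leaving-one-out criterion, no word classes are used, and the names `merge_pair`, `bigram_perplexity` and `best_phrase` are hypothetical.

```python
import math
from collections import Counter


def merge_pair(tokens, pair):
    """Replace each occurrence of the adjacent word pair by one phrase token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def bigram_perplexity(tokens, n_words):
    """Add-one smoothed bigram perplexity on the text itself.

    Assumption: the log-probability is normalized by n_words, the length of
    the ORIGINAL word sequence, so that values stay comparable after merging.
    """
    seq = ["<s>"] + tokens
    vocab_size = len(set(seq))
    history_counts = Counter(seq[:-1])
    bigram_counts = Counter(zip(seq[:-1], seq[1:]))
    log_prob = 0.0
    for prev, cur in zip(seq[:-1], seq[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (history_counts[prev] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / n_words)


def best_phrase(tokens):
    """Return the adjacent pair whose merge lowers perplexity most, and the gain."""
    n_words = len(tokens)
    base = bigram_perplexity(tokens, n_words)
    candidates = set(zip(tokens[:-1], tokens[1:]))
    gains = {
        pair: base - bigram_perplexity(merge_pair(tokens, pair), n_words)
        for pair in candidates
    }
    best = max(gains, key=gains.get)
    return best, gains[best]
```

Calling `best_phrase` on a tokenized corpus returns one candidate pair and its perplexity gain. A greedy procedure would merge that pair and repeat, stopping once the gain is no longer positive.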
Similar resources
Automatic Acquisition of Phrase Grammars for Stochastic Language Modeling
Phrase-based language models have been recognized to have an advantage over word-based language models since they allow us to capture long-spanning dependencies. Class-based language models have been used to improve model generalization and overcome problems with data sparseness. In this paper we present a novel approach for combining the phrase acquisition with class construction process to au...
Learning Word Embeddings for Hyponymy with Entailment-Based Distributional Semantics
Lexical entailment, such as hyponymy, is a fundamental issue in the semantics of natural language. This paper proposes distributional semantic models which efficiently learn word embeddings for entailment, using a recently proposed framework for modelling entailment in a vector space. These models postulate a latent vector for a pseudo-phrase containing two neighbouring word vectors. We investig...
Category Theory Foundation For Engineering Modelling
Category theory provides a formal foundation for engineering modelling, as well as for mathematics and science. Both structure and behaviour, as they occur in engineering models for manufactured products and biomedicine, can be embedded as axiom sets within a mathematical formalism called Algos. The Algos language is a two-sorted first-order Horn clause theory based on topos language construction...
Paraphrastic Language Models
Natural languages are known for their expressive richness. Many sentences can be used to represent the same underlying meaning. Only modelling the observed surface word sequence can result in poor context coverage and generalization, for example, when using n-gram language models (LMs). This paper proposes a novel form of language model, the paraphrastic LM, that addresses these issues. A phras...
How Useful Are Early Economic Models?; Comment on “Problems and Promises of Health Technologies: The Role of Early Health Economic Modelling”
Early economic modelling has long been recommended to aid research and development (R&D) decisions in medical innovation, although they are less frequently published and critically appraised. A review of 30 innovations by Grutters et al provides an opportunity to evaluate how early models are used in practice. The evidence of early models can be used to inform two types...
Publication date: 1996